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ABSTRACT 

We present a new method for component separation aimed to extract Sunyaev- 
Zeldovich (SZ) galaxy clusters from multifrequency maps of Cosmic Microwave Back- 
ground (CMB) experiments. This method is designed to recover non-Gaussian, spa- 
tially localized and sparse signals. We first characterize the cluster non-Gaussianity 
by studying it on simulated SZ maps. We the apply our estimator on simulated ob- 
servations of the Planck and Atacama Cosmology Telescope (ACT) experiments. The 
method presented here outperforms multi-frequency Wiener filtering both in the recon- 
structed average intensity for given input and in the associated error. In the absence 
of point source contamination, this technique reconstructs the ACT (Planck) bright 
(big) clusters central y parameter with an intensity which is about 84 (43) per cent 
of the original input value. The associated error in the reconstruction is about 12 and 
27 per cent for the 50 (12) ACT (Planck) clusters considered. For ACT, the error is 
dominated by beam smearing. In the Planck case the error in the reconstruction is 
largely determined by the noise level: a noise reduction by a factor 7 would imply 
almost perfect reconstruction and 10 per cent error for a large sample of clusters. We 
conclude that the selection function of Planck clusters will strongly depend on the 
noise properties in different sky regions, as well as from the specific cluster extraction 
method assumed. 
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1 INTRODUCTION 

The study of the Cosmic Microwave Background (CMB) 
has greatly improved our understanding of the universe in 
the last decade. The measurement and interpretation of 
the CMB power spectrum has determined the most im- 
portant cosmologica l parameters with very high accuracy 
([Spergel et alJ 2003). More experiments, now planned or 
underway, will produce higher resolution multi-frequency 
maps of the sky in the 100-400 GHz frequency range. One 
of the most important new scientific goals of these experi- 
ments is the detection of clusters thro ugh their characteris- 
tic Su nyaev-Zeldovich (SZ) signature (|Sunvaev Zeldovichl 
1980). Because the SZ signal is substantially independent 
of redshift, SZ clusters above a mass threshold will be ob- 
served at very high distances. Such clusters may be used 
to infer cosmological information via number counts and 
powe r spectrum analysis of SZ maps (iMa iumdar & , Mohrl 
200i iLevine et all 120021 : 151 feOOfl : iBattve fc Wellerl |2C 



Well er fc Batt ve 2003). These estimates, however, typically 
assume that all clusters above a given flux are perfectly re- 
constructed and detected in the CMB maps. In practice, 
this may not be the case. SZ clusters have radio intensities 
comparable to other intervening cosmological signals like the 
CMB and point sources, so disentangling them is difficult. 
Moreover, beam smearing and instrumental noise play a role 
in our ability to adequately reconstruct the observed cluster. 
For a given central comptonization parameter y (see below 
for definition), the reconstructed value may also depend on 
the cluster location and shape. Given experimental specifica- 
tions and a reconstruction technique, the associated recon- 
struction error needs to be assessed and accounted for when 
relating cluster observables to cosmological models. This er- 
ror is in addition to the one usually associated with clus- 
ter scaling relations, which play a major rol e in the current 
deter mination of as from galaxy clusters (|Pierpaoli et alJ 
|2C 
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Given the observed maps at different frequencies, clus- 
ters can be detected with two quite different approaches: 

a) a formal component separation is applied to the map, 
and the cluster map is reconstructed along with the maps 
for all other processes. 

This approach was initially developed to reconstruct 
CMB maps. Most often it has been applied in Fourier space, 
with and without assumptions on the Gaussianity of the sig- 
nal (iTegmark fc Efstathiodll996l : IStolvarov et al.ll2002l) . 

b) The maps at different frequencies are first combined 
in an optimal way in order to enhance the cluster signal and 
minimize the other ones. A spatial filter, (typic ally circularly 
symmetric) is then app lied to the final map ([Herranz et alJ 
bOOllDiego et alJl2002ft . 

Clusters of galaxies maps present very specific features, 
in particular: i) clusters are "rare" objects in the map — 
they do not fill the majority of the spa ce; it) The cluster 
signal is non- Gau ssian on several scales (Zh ang et alJ l2QQ2: 
Dieg o et allbOOfl . and we are most interested in the non- 
Gaussianity on the scale of the typical cluster core size; Hi) 
the signal on different scales is correlated. Keeping these 
characteristics in mind, in this paper we develop a method 
for formal component separation of different signals that is 
tailored to better reconstruct the SZ galaxy cluster signal 
from multifrequency maps. Our map reconstruction method 
is wavelet-based and it can take into account the specific 
non- Gaussianity expected for a given signal (SZ clusters in 
particular). We will see that the combination of these two 
features enables us to better reconstruct the intensity of the 
cluster center, which is essential to reliably relate the SZ sig- 
nal to theoretical quantities in order to derive cosmological 
parameters. 

Wavelets have been applied to the analysi s of multifre- 
quen cy maps to characterize specific s ignals 

| 2004l) and extra ct point sources (|Cav6n et alJ |2j 

IVielva et al1l200lh . to recover the CMB sky with Maxi- 
mum Entropy methods (|Vielva et al.ll200ll : |Maj^inger^^aL 
2004) and to assess statistics of the CMB (|Tejmrio^_et_aL 
I l99flt iHobson et a Tl ll999l : ICavon et al 1 l200ll : lAghanim etaL 
2003). Here we focus on galaxy clusters, adopt a suitable 
wavelet basis, and develop a new Bayesian estimator ap- 
propriate to the reconstruction of the non-Gaussian signal 
associated with cluster maps. 

Future SZ surveys may be very different in nature. 
Space-based Planck will cover the whole sky, seeing all the 
most massive clusters but with relatively low resolution. A 
number of ground-based experiments will cover a smaller 
area with higher resolution. Because the performance of the 
reconstruction method also depends on the experimental 
characteristics, we apply our method to two different SZ ex- 
periments, the Planck surveyor and the Atacama Cosmology 
Telescope (ACT). The main purpose of this paper is to in- 
troduce our new estimator, on which component separation 
is based, and to test and compare it with standard tech- 
niques. After reconstructing the SZ maps, we assess which 
error is involved in the determination of SZ observables from 
the map. The analysis of simulated maps allows us to design 
observables suited to derive cosmological parameters for a 
given experiment. We will also assess the impact of instru- 
mental limitations in recovering the cluster maps. We then 
assess the level of completeness and purity of the surveys for 
different flux cuts. Throughout the paper we compare our 



method with the standard multi-frequency Wiener filtering 
technique applied in the same wavelet space. 

This paper is organized as follows: in section |2 we de- 
scribe the relevant signals at radio/infrared frequencies and 
the characteristics of the experiments that we are going to 
consider; we then describe our wavelet basis and reconstruc- 
tion method in section |3 section 0|is dedicated to the results 
and section |5]to the conclusions. 



2 ASTROPHYSICAL SIGNALS AND 
INSTRUMENT CHARACTERISTICS 

We will consider experiments that will provide a map of the 
sky in the frequency range 100-400 GHz. In this range we 
expect to observe several galactic and extragalactic signals, 
like the synchrotron and dust emission from the Galaxy, 
the CMB, radio and infrared point sources and SZ galaxy 
clusters. We are interested here in the reconstruction of the 
cluster signal. Because the galactic signal has a very dif- 
ferent spatial structure from SZ clusters, we assume here 
that the Galaxy is not a fundamental limitation in recon- 
structing SZ cluster maps, which is likely true in substantial 
portions of the sky. Point sources, particularly dusty star- 
forming galaxies at high-redshift which shine brightly at sub- 
mm frequency, may be a potential concern. The modeling of 
source number counts, frequency dependences, and spatial 
correlations remain uncertain. We prefer to leave them out 
of the analysis at the present time, and test our technique 
in absence of point sources. For Planck this approach is jus- 
tified by the presence of higher and lower frequency chan- 
nels, which will hopefully allow modeling and subtraction of 
point sources before component separation. For ACT, which 
lacks thes e channels, this is probably a too optimistic as- 
sumption (IWhite fc Maiumdarll2004l : lHuffenberger fc SeTiakl 
2004). However, we use it for simplicity in testing our new 
estimator, which in any case we find to be superior to Wiener 
filtering techniques used previously. We will assess in future 
work the contamination of such sources in the context of the 
technique presented here. 

We simulate the CMB with Gaussian random fields us- 
ing a power spectrum derived fr om the best-fitting WMAP 
parameters ([Bennett et al.ll2003l) . 

In the following we review the SZ cluster signal and 
describe the characteristics of the simulated maps and of 
the experiments we are discussing here. 

2.1 Clusters of Galaxies 

CMB photons traveling from the last scattering surface to 
the earth interact with the high energy electrons in inter- 
vening massive galaxy clusters. As a consequence of this 
scattering, the CMB temperature and intensity is modified 
in the direction of a cluster. This is known as the thermal 
and kinetic SZ effects. The effect related to the electron's 
thermal motion causes a change in the CMB intensity 51 in 
the direction n of the cluster: 



61(h) = -2y(n)I Si[x(v)], 
where 



x exp(xj 
(exp(x) — l) 2 



x exp(x) + 1 
2 exp(x) — 1 



(i) 



(2) 
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with x = hiy/ksTcMB 
2(k B T C MB) 3 /(hc) 2 and 



I//57 GHz, I 



Experiment v (GHz) FWHM (arcmin) a(fiK) 



y[h) 



m e c 2 J 



n e (ln)kBT e (ln)dl = 



m e c : 



f 

Jo 



P e (ln)dl(3) 



is the comptonization ^/-parameter which depends on tem- 
perature (T e ) number density (n e ) through the pressure (P e ) 
of free electrons. The integral is over the proper distance I. 
The corresponding temperature change reads: 



STcMB 
TcMB 



-2y(n) 



x exp(x) + 1 
2 exp(x) — 1 



(4) 



Note that in the Ray leigh- Jeans limit (x <C 1) we have 
STcmb/Tcmb = SIu/Iu = -2y(n). The thermal SZ sig- 
nal causes a decrement (increment) of the intensity below 
(above) the characteristic frequency of 217 GHz. The ther- 
mal SZ is observable between 10 and 800 GHz, with a mini- 
mum (maximum) around 145 (350) GHz. Because of this pe- 
culiar frequency dependence, multi-frequency observations 
will be crucial in recovering the cluster SZ signal. Massive 
clusters have typical y parameters of the order of 10 -4 cre- 
ating an SZ signal of the same order of magnitude as that 
of the CMB fluctuations. 

The interaction of free electrons in galaxy clusters with 
the CMB photons is also responsible for the kinetic SZ. Clus- 
ters of galaxies have a peculiar motion with respect to the 
Hubble flow. As a consequence, the scattered CMB photons 
are subject to a Doppler effect due to the bulk cluster mo- 
tion. The kinetic SZ is weaker than the thermal SZ and it 
has a similar frequency dependence to the CMB, therefore it 
is harder to detect and separate from the CMB. Very likely 
the kinetic SZ will be measured after cluster positions have 
been determined. We therefore ignore it in the component 
separation technique developed in the following sections. 

2.2 Characteristics of future CMB experiments 

The performance of any component separation technique 
also depends on the characteristics of the experiment in 
hand. Future SZ surveys will have a broad spectrum of possi- 
ble characteristics in term of frequency coverage, sensitivity 
and resolution. For this reason we specify here our analysis 
for two future CMB experiments very different in nature: 
Planck and ACT 1 . The specifications for these experiments 
most relevant for SZ surveys are summarized in tabled 

Planck is an all-sky experiment with a quite broad beam 
(5 arcmin) if compared to the typical cluster size (1-10 ar- 
cmin), therefore we expect Planck to detect the most mas- 
sive (or extended) clusters only. The large area covered, how- 
ever, will allow to detect a sizable number of them. 

ACT is a higher resolution, ground based experiment 
where most massive clusters are larger than the beam. At 
this resolution, clusters appear aspherical (see fig. first 
panel) and the challenge here is also to be able to individ- 
ually detect merging structures and resolve the outskirts of 
massive clusters. Moreover, we want to find the many small 
clusters which may be confused with noise (or other point 
sources) . 

In our simulations of both experiments, the detection of 



Planck 


143 


7.1 


6 




217 


5.0 


13 




353 


5.0 


40 


ACT 


145 


1.7 


2 




217 


1.1 


3.3 




265 


0.93 


4.7 



Table 1. The characteristics of the Planck and ACT experi- 
ments at the frequencies used in this work. The RMS detector 
noise per full-width-half-maximum pixel, labeled cr, is given in 
thermodynamic temperature units. 



the cluster central emission is going to be challenged by the 
following factors: i) smearing by the instrumental beam; ii) 
instrumental noise in) confusion with the CMB anisotropy 
structure. 

In the following we assess how precise the reconstruc- 
tion of the central emission in these experiments is. In or- 
der to do so, we constructed simulated maps of the sky at 
different frequencies for both experiments with CMB and 
SZ cluster maps. For Planck, t he 10 x 10 sq uare- degree SZ 
cluster maps where taken from IWhitel (|2£)03) 2 . For ACT we 
use 1.19 x 1.19 degrees maps obta ined from hydrodynamical 
simulations by IZhang et alJ (|2002f) . For both experiments 
the reconstruction was based on three frequency channels, 
which are specified in tabled 



3 THE RECONSTRUCTION METHOD 

In this section, we present the method used to reconstruct 
the different processes from the observed maps. Instead of 
the usual Fourier space decomposition of the signal, we 
adopt here a wavelet decomposition. We perform a Bayesian 
least square estimation of the different processes, modeling 
the statistics of wavelet coefficients of the signals by Gaus- 
sian scale mixtures. The estimation can be thought of a 
weighted local Wiener filters on neighborhoods of wavelet 
coefficients, where the weight is determined by the specific 
non-Gaussianity under consideration. Below, we describe 
the details of the wavelet choice, and of the reconstruction 
method. 



3.1 The wavelet decomposition used 

Since we deal with sky maps, we are interested in two- 
dimensional wavelets. In the choice of the wavelet to use for 
our image processing, we have to balance the orthogonal- 
ity of the wavelet (which is desirable in order to well-define 
the statistics) with the compatibility of the wavelet with the 
image at hand. Orthogonal wavelet bases, such as the 2-D 
Daubechies wavelets, are typically heavily biased towards 
horizontal and vertical directions; moreover, they are usu- 
ally not very well concentrated in frequency. To avoid this, 
we use an overcomplete wavelet representation inspired by 
the work of IPortilla et alJ ([2003) which is more adequate to 
the analysis of astrophysical images. 



http:/ / planck . esa. int / , |htt p : / / www . hep . upenn . edu / act / index . ht ml | 2 http:/ /pad. berkeley.edu/tSZ/PlanckSZ/ 
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Wavelet power spectra, Planck 143 GHZ 
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Figure 1. Top row: wavelets in space; Bottom row: wavelets in 
Fourier plane. First column: wavelet at a fine scale j + centered 
at location qo, oriented along the first diagonal. Second column: 
wavelet at a coarser scale j, centered at location qi, oriented along 
the first diagonal. Third column: wavelet at the same coarser 
scale j, centered at location q2, oriented along the horizontal axis. 
Fourth column: scaling function, centered at location qi. 



The wavelet decomposition of 
reads: 



signal s in our case 



(5) 



3=0 m=l q£2-3 Z 2 



where <j) q are the scaling functions, ipj, m , q are the wavelets 
and (,) are scalar products. The sum ^ q {s,(i)q)(j)q is the 
projection of s on the coarsest scale, i.e. a low-pass version of 
s. Each scaling coefficient (s,<p q ) contains information about 
the signal s at the coarsest scale and at a specific location 
in space q. For j fixed, the sum ^ m ^ q {s,ipj,m, q )ipj,m, q is 
the projection of s on the scale j, i.e. a band-pass version of 
s. Each wavelet coefficient (s,i/jj,m,q) contains information 
about the signal s at the specific scale j, orientation m and 
location in space q. 

As usual with wavelet transforms, changing scale is done 
by dilating, and changing location is done by translating the 
wavelet: 

1pj + l,m,q(x) = 2lpj,m,q( < 2x) lpj,m,q(x) = 1pj,m,o(x ~ q) (6) 

Hence, scale j + 1 corresponds to a spatial frequency band 
that is twice as wide and for which the central frequency is 
twice as large as that of scale j. On the other hand, in space, 
the wavelets at scale j + 1 are better localized than at scale 
j since they are more narrowly concentrated around their 
center q (see Fig. Q column 1 and 2). 

Unlike the 2-D Daubechies wavelets, the wavelets (and 
scaling function) we use here are defined in the Fourier plane. 
This ensures that they are well concentrated in frequency. 
Moreover, it enables us to introduce orientation by rotating 
the Fourier transform of the wavelet (see Fig. Q column 2 
and 3). If / is the Fourier transform of / and (r, 6) are polar 
coordinates, then: 



ipj,m, q (r,6) = ipj,o, q (r,0 - — ) 



(7) 



The transform is therefore close to rotation invariant and 
computation is fast via FFT. 

The Fourier transform of the wavelets and scaling func- 
tions are 



Figure 2. Left: Power spectra in Fourier space of the wavelets 
at different scales (dash-dot) and signals : CMB (plain), clusters 
(dashed) and noise (plain with circles) for Planck at 143 GHZ. 
Note: here we only index the scale of the wavelet. Right: corre- 
sponding Power spectra in wavelet space for the CMB (crosses), 
clusters (plus) and noise (circle) for Planck at 143 GHZ. 



for j > 0, and < m < M, where the low-pass filter L(r), 
the high-pass filter H(r) and the oriented filters Gm{0) are 



L(r) = cos(- log 2 (r))£i< r <2 + S r <i 
H(r) - sin(- log 2 (r))£i< r <2 + S r >2 



(9) 



G M (0) 



(M — 1)! 



y/M[2(M-l)]\ 



|2cos0r 



The set of all wavelets and scaling functions determines a 
redundant system (they are linearly dependent), however, 
the Plancherel equation holds: 



|.|£. = £ 

9 GZ 2 



i(«.wi 8 +EE E i< s 'Vwi 2 (io) 

j=0 m=l q e2-3 Z 2 

This ensures perfect reconstruction (eq. |5J. 

Given an image s, it is possible to define a wavelet power 
spectrum P w , which is related to the Fourier based power 
spectrum P(k) = \s(k)\ 2 : 



Pv,(j) 



^ |(s,Vj,m,9>| 



E 



(ii) 



The Fourier power spectra (in /xK 2 ) for the relevant sig- 
nals (CMB, SZ clusters and noise) are displayed in Fig. El 
(left panel) together with the windows of wavelets at differ- 
ent scales. The right panel shows the corresponding wavelet 
spectra. 

3.2 The estimator 

Formally, our goal is to estimate several processes (CMB 
and SZ) from their contributions in observations at different 
frequencies. We estimate the processes {x(p,iso)} p from the 
observations {y(v)} u given that: 



(12) 



b (r,6) = L(2r, 



(8) 



where x(p, z/o) is the template of the pth process at a given 
frequency z/o, f(p, v) is the frequency dependence of the pth 
process, B(y) is the beam (assumed to be a Gaussian with 
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given FWHM) and N(y) is the frequency dependent white 
noise. 

In wavelet space, these equations look similar: the ob- 
served wavelet coefficients {y(y)^j,m,q) — yj,m,q(y) are lin- 
ear combinations of the wavelet coefficients of the different 
processes Xj> >m / z/o). 

Our estimation method will rely on two principles to 
discriminate the contributions from different processes. The 
first one is that we know some statistical properties of the 
processes (e.g. the CMB and noise are Gaussian processes, 
while the clusters are not). The second one is that some spa- 
tial properties of the processes can be captured by modelin g 
the coherence of wavelet coefficients ([Romberg et al.l Eo03'). 
For example, clusters can be described as spatially localized 
structures with a high intensity peak. Hence if a cluster is 
centered at location go, one should see rather big wavelet 
coefficients around this location through different scales. If 
there is no cluster there, all these coefficients should be fairly 
small. This would not be the case for the noise. We aim to a 
local reconstruction which can take advantage of these corre- 
lations. In order to estimate a particular wavelet coefficient 
Xj,m, q , we consider a neighborhood of coefficients around it: 

X = 'Kj^m^q — ( Xj^rn,q 3 3?j',m,q+l , Xj^rn,q — 1 3 %j — l,m,q (13) 

such neighborhood contains wavelet coefficients at the same 
scale with close location, and at a close scale with the same 
location. 

Furthermore, we choose to st atistically describe the sig- 
nal as a Gaussian scale mixture (Portilla et al. 2003): 



(14) 



where u is a centered Gaussian vector of the same covariance 
as x (C x — C u ), the multiplier z is a scalar random vari- 
able (whose distribution we prescribe later) and the equality 
holds in distribution. The variables u and z are independent 
and E{z} — 1. The covariance of x captures the spatial co- 
herence of the process, while the (non-)Gaussianity of the 
signal is captured by the distribution of the multiplier z. We 
will take the covariance C x to be a function of the scale j 
and orientation m. The convenience of this signal's descrip- 
tion will be more clear in the next section. 



3.2.1 Step one: de-noising one observation of one process 

To illustrate the idea for the reconstruction process, let us 
first consider the simple case where we observe one process 
polluted by noise: y(yo) = x(iso) + N(i/q). The equations for 
each single wavelet coefficient and for the neighborhood of 
wavelets coefficient read: 

yj,m,q — %j,m,q + ^j,m,q (15) 

y = x + N 

where N is Gaussian and x is described with the Gaussian 
mixture in eq. 1141 The convenience of this representation is 
that for a fixed multiplier z = Zo, the Bayes least square 
estimate of x given the observed vector y and z, would be 
a Wiener filter on the neighborhood of wavelet coefficients: 

£{x|y, z = z } = zoC x (zoC x + C N ) _1 y- (16) 

However in our model z is not a constant, so i?{x|y}, the 
Bayes least square estimate of x, is a weighted average of 
the Wiener filters above: 



£{x|y} 



p(z = z \y)E{x.\y, z = z } dz 



(17) 



The weights are determined by the probability of z given 
the observation y, noted p(z = zo\y), which is computed via 
Bayes rule: 



pO so |y) 



p(y\z zq)pz(zq) 

J P(y\z = z')p z (z')dz' 



(18) 



where p(y\z = z') is a centered Gaussian vector of covariance 
z'C x + Cn, and p z is the probability distribution of z which 
we will describe in 13.31 

Following this procedure, one gets an estimate i?{x|y} 
for each neighborhood of coefficients x. From this estimated 
vector, we only keep the estimate of central coefficient Xj, m , q . 



3.2.2 Step two: de-blurring one observation of one process 

Consider now the case where the observed signal is a blurred 
version of the original: y(yo) — x(vq) * B{vq) + N(vq). The 
convolution with the beam correlates the signal spatially. As 
a result, a single observed wavelet coefficient is dependent 
on many wavelet coefficients in the signal. The equations do 
not decouple any more: 



yj,r> 



23 



+ N, 



(19) 



Since support of the beam is infinite, every wavelet co- 



efficient Xa 



contributes to yj, m , q - However the biggest 



contribution come from the wavelet coefficients at a close lo- 
cation (\q — q'\ small). Hence if the size of the neighborhood 
is big enough, one can make the following approximation: 



yj,m,q — BjX , ? 



+ N,- 



(20) 



where Bj is a matrix which depends on the beam B and 
the scale. Eq. (fT7)> and (fT£Jl hold with the modified Wiener 
filter: 



£{x|y,z = z } = zoC x B;(zoBjC x B; + Cp 



(21) 



and the covariance of p(y\z = z') is now: ^BjCxBJ + Cn- 
The matrix Bj is a truncated version of the matrix of 
convolution by the beam projected at scale j. For eq. (Il2( H to 
be a good approximation, we allow the size of the neighbor- 
hood of wavelet coefficients to vary with the scale. Specif- 
ically, we extend the neighborhood of eq. (|13f> so that we 
capture 90% of the power of the beam at each scale. To do 
this, we simply include in Xj,m,q the coefficients Xj jrn ^ q /, with 

\q -q\ < k, choosing k so that J2\ q >\ <k l S (fr)| 2 > °- 9 - Note 
that the coarser the scale, the smaller k is. 



3.2.3 Step three: de-mixing several processes from several 
observations 

One can extend this procedure to the case where several pro- 
cesses contribute to signals observed at different frequencies, 
as in equation (|12f> . 

For each process x(p, z/o), each neighborhood of wavelet 
coefficients is modeled as a Gaussian scale mixture: 



lip) = \/ Z 3,rn,q(p)\lj,m,q(p) 



(22) 
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where z(p) are scalars of mean 1, u(p) are Gaussian vectors, 
and all these random variables are independent. The ap- 
proximation in equation for a neighborhood of wavelet 
coefficients now reads: 



yj,m,g(» = ^ /(p, l/)Bj (l/)Xj, m , q (p) + Nj, m>g (l/) 



(23) 



Assuming we have observations and P processes, we note 
Y = (y J -,m, g (^i), . . . , yj,m, q (yK)) and Z = (z(l), . . . , z(P)). 
If Z is fixed, we obtain a multicomponent Wiener filter on 
neighborhood of wavelet coefficients: 

K 

E{ X (p)\Y, Z}= z(p)C x(p) J2 f(p,^)B^)G^ k ,y{i/ k ) (24) 

k,k'=l 

where 

p 

Gfc.fe' = y^^(p)/(p^fc)/(PW)Bj(^)C x(p )Bj (i^/) 

P =i 

+^/c=fc / C N ( i/k ) (25) 
The Bayes least square estimate of the full model is: 

£{x(p)|Y} = / p(Z|Y)£{x(p)|Y,Z} dz! dz 2 .. dz P (26) 



with the weights: 
p(Z = (ai)i|Y) = 



p(Y|Z = (ai,..,a P ))n^,(«») 
/p(Y|Z = (/3i,..,/3p))nPz 1 G8 i )dft 



(27) 



where p(Y = y(z/fc)fc|Z = (ai)*) * s a centered Gaussian with 
covariance matrix Gk,k'- 

In practice we compute the Bayes least square estimate 
for each wavelet neighborhood x and only keep the central 
coefficient Xj, m , q for each process. We then reconstruct the 
processes by inverting the wavelet transform. 

In order to compute the estimate, we need the covari- 
ance matrices C N( v) and C x ( p ). Note that these matrices 
depend on the scale and orientation . Since the level of noise 
is assumed to be known, the covariances Cn(^) can be com- 
puted. The covariances for the different processes C x ( p ) are 
estimated from simulated input maps. We use the wavelet 
transform described in (|3.1f> with 4 orientations and 5 scales 
for the ACT experiment, with 4 orientations and 6 scales 
for the Planck experiment. The size of the neighborhoods is 
chosen adaptively at each scale so that the approximation 
in equation is valid (as explained in 13.2. 2f> . Finally, we 
choose the probability distributions p Zi to capture the prop- 
erties of the process x(i, z/q), as noted in the next section. 



3.3 Statistical properties of the signals 

We now must decide which distribution p z we intend to use 
for each signal. In the case where 2 = 1, the Gaussian scale 
mixture described in eq. (fT^J) reduces to a Gaussian process. 
We are considering here the CMB to be Gaussian, and will 
consistently assume z = 1 at all scales for this signal. 

The cluster signal however is typically non- Gaussian. In 
order to model such non-Gaussianity, we will need a more 
elaborate distribution for z, which could in principle be cho- 
sen with or without any particular link to the true distribu- 
tion. In the following we will consider different cases for the 



cluster's z distribution (with and without a physical signifi- 
cance) and we will compare the results. 

In order to compute a physically based model for the 
cluster non-Gaussianity, we analyzed the simulated SZ map. 
For the Gaussian scale mixture model of the signal, it is dif- 
ficult to solve the closed form equation for p z , (the proba- 
bility of z) given the probabilities of x and u. We will in- 
stead study the probability of its logarithm: p\ n z and refer 
to it as the prior. Note that p z can then be easily recov- 
ered: p z (v)dv = p\ nz (u)e u du for u = \nv. Taking the the 
logarithm of eq. (fTlJ) . one obtains the following equation: 

poo 

Pin\x\(v) = / 2 Pim(2y) Pin\u\(y -v) dy (28) 

J 

which we solve given that u is Gaussian and pi n |x| is esti- 
mated from the SZ input maps (details will be given in an 
upcoming paper (Anthoine, . in prep.). 

Together with the physical prior described above, we 
will also consider two other non-physical ones. These are: i) 
the Gaussian prior, which corresponds to assuming 2=1, 
or equivalently p\ nz (v) = Sd(v — 0); ii) the non- informative 
prior, i.e. the uniform probability distribution on In 2, 
P\tlz(v) — constant. The latter has been used for recover- 
ing non-Gaussian signals in digital image processing. We 
will compare the performances of the non informative and 
Gaussian prior to the one that we derived from the input 
maps. 

We plot the distributions described above in Fig. |3| for 
the scale where the SZ signal is most pronounced. The be- 
haviour of p\ nz for small (resp. large) values influences the 
probability of finding small (resp. large) values of \x\. In 
our model, p x describes the statistics of wavelet coefficients. 
Hence, the higher the probability p x for small values of |x|, 
the sparser we expect the signal to be. The more pronounced 
the tails of p x for large \x\ are, the more intense we expect 
the non-zero signal (i.e. the clusters) to be. 

In Fig. 2] we plot the marginal distribution of the 
wavelet coefficients for SZ clusters as computed from the 
map, and show how the different z priors in fig. |3| would fit 
the actual distribution. The right panel shows that the tails 
of the clusters' distribution is much broader than the one of 
the Gaussian. As a consequence, the Gaussian prior tends 
to underestimate the amplitude of clusters. The left panel 
shows that the Gaussian prior underestimates the number 
of wavelets coefficients with small amplitude, while the non 
informative prior and to some extent the profile prior over- 
estimate them. This means that the sparsity of the cluster 
signal is not well described by these distributions. The Gaus- 
sian prior will tend to create more clusters than necessary 
while the non informative prior or the profile will miss clus- 
ters. 

We will also discuss another case that is derived from 
the physical one: a truncated version of the p\ n z estimated 
from the input maps (which we refer to as profile d). This 
version is a trade-off between the profile computed from the 
data and the Gaussian prior. 

To summarize: in the rest of the paper we will compare 
the results given by these four priors: 

a) Gaussian prior, p\ nz (y) = Sd(v — 0), referred to as 
Gaussian. 

b) Non informative prior, pi nz (v) = constant, referred 
to as Uniform. 
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Figure 3. Examples of probability distribution for lnz for a par- 
ticular scale of the wavelet decomposition. Here z is the multi- 
plier in the Gaussian scale mixture model. Plain: Gaussian prior, 
Vlnziy) — 3d(v — 0), z = 1- Dash-dot: non-informative prior, 
Pl n z = constant. Dashed: profile computed from the data. Dashed 
and stars: profile computed from the data truncated (profile d). 

c) Profile computed from the data, referred to as Profile. 

d) Truncated profile computed from the data, referred 
to as Profile d. 



4 RESULTS 

4.1 The reconstructed maps 

We applied the method described above to the ACT and 
Planck simulated maps with realistic beam and noise prop- 
erties (see table Q), with different assumptions on the prior 
Pinz- In figure |5] and |S| we show the input and reconstructed 
y maps for ACT and Planck respectively. 

In these figures we notice that, even using a wavelet ba- 
sis which allows for a local reconstruction, the Gaussian prior 
for p\n z (which causes the estimator to reduce to Wiener fil- 
tering on the neighborhood of wavelet coefficients) underes- 
timates the high-intensity peaks. On the contrary, the profile 
and uniform prior perform (equally) better reconstruct the 
central intensity of the bright clusters, indicating that the 
specific shape of the tail of the prior distribution of multi- 
pliers for high z is not particularly relevant, as long as it 
provides enough power at sufficiently high z values. 

As for the low-intensity clusters, they seem better re- 
constructed in the Wiener filtered maps than by using the 
real profile (or uniform) prior for p\ nz . This indicates that 
our estimate of the true pi n z , the profile, may be weighing 
too heavily low-intensity points. This effect could be caused 
by the procedure we adopt in computing pi nz from the in- 
put maps. Because of our deconvolution technique (see sec- 
tion ea. EB| . it may well be that if the true pi nz had a 
very sharp drop at some low- z value, we wouldn't be able 
to model it accurately. For this reason, we also tried a pro- 
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Figure 4. Top panel: p x , the distribution of x = yfzu. Bottom 
panel: the logarithm of this distribution: ln(p x ). In order of in- 
creasing value of p x at x = (left) and increasing value of \x\ 
for \n(p x ) = — 10 (right). Plain: Gaussian prior, x is Gaussian. 
Dashed and stars: p x corresponding to the profile computed from 
the data truncated (profile d). Plus: distribution p x as numerically 
computed from the input maps. Dashed: p x corresponding to the 
profile computed from the data. Dash-dot: p x corresponding to 
the non-informative prior. 

file that corresponds to the true one truncated at z's lower 
than the peak point (see fig. profile d case) . The trun- 
cated profile performs as well as Wiener filtering in recov- 
ering low-intensity clusters, still improving the results on 
high- intensity ones. 

4.2 The central y parameter 

When it comes to infer cosmological parameters from num- 
ber counts in other wavebands (e.g. X-ray and optical) the 
common practice is to retain only the brightest clusters 
which are less affected by selection effects and have a bet- 
ter characterized scaling function. We shall adopt the same 
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Figure 5. The simulated (top panel) and reconstructed ACT 
cluster maps. The second panel correspond to the Gaussian prior, 
the third to the actual cluster prior profile, and the fourth to the 
truncated cluster prior profile d. The maps is 1.2 deg on a side. 



Figure 6. The original and reconstructed Planck cluster maps. 
The distributions are the same as corresponding ACT panels. 
Note that the color scale here is in the logarithm of the luminosity. 
Border effects are visible: the areas at the edges are disregarded 
in the analysis. The map is 10 deg on a side. 
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Figure 7. The input and output y parameters for bright clusters 
in the ACT experiment. Top plain line: line of perfect reconstruc- 
tion. Gaussian prior: circle, bottom plain line; Profile prior: plus, 
dash-dot line; Profile d prior: cross, dashed line. 







Gaussian 


Uniform 


Profile 


Profile d 


Slope 


ACT 


0.69 


0.84 


0.84 


0.83 




Planck 


0.07 


0.41 


0.43 


0.27 


Spread 


ACT 


0.15 


0.12 


0.12 


0.11 




Planck 


0.35 


0.27 


0.27 


0.31 


Table 2. 


The slope 


and the s 


pread of the the lines 


in figs |7| and 



|8| obtained with different priors p\ n z . 

strategy with SZ clusters, also motivated by the fact that 
they are less affected by reconstruction errors. For these 
clusters it is appropriate to assess how the input y param- 
eter relates to the reconstructed one. In figs. □ and El we 
show the reconstructed y parameter versus the input one 
for ACT and Planck respectively, smoothed over a scale 
which is roughly the smallest beam size in each experi- 
ment. In table we quote the slope of the fitting line for 
the different distributions considered, together with the av- 
erage percentage departure from it, or spread, computed 
as the mean of the ratio: | (input — predicted) /input | with 
predicted = output/slope. In general, using either the uni- 
form or the profile prior for p\ n z improves both the average 
reconstruction (the slope of the curve) and the associated 
error. The actual performance depends on the intensity cut 
applied: here we plot the 50 brightest ACT clusters and the 
12 Planck most extended clusters in the reconstructed maps 
(see section COI for a discussion on selection effects). In the 
case of ACT the average reconstruction improves by about 
20 per cent on the Gaussian prior case when our approach 
is applied using any of the three other distributions. The 
scatter is also reduced by a factor 2. In the case of Planck, 
the improvement in the average reconstruction is roughly a 
factor 5, and the error is also reduced. 

The profile and uniform prior perform very similarly in 
reconstructing the cluster centers, so that we only quote the 
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Figure 8. The input and output y parameters for bright clusters 
in the Planck experiment. Top plain line: line of perfect recon- 
struction. Gaussian prior: circle, bottom plain line; Profile prior: 
plus, dash-dot line; Profile d prior: cross, dashed line. 
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Figure 9. The slope (plain) and spread (dashed) of the fitting 
line for the input/output y parameter when averaged over dif- 
ferent angles C . This plot is constructed using the 50 brightest 
clusters in 24 (1.2 deg) 2 ACT maps. 

reconstruction parameters for the profile prior. It is com- 
forting, however, that the details in the shape of the non- 
Gaussian profile play a minor role in the reconstruction per- 
formance. 

4.3 Performances in the y parameter 
reconstruction 

We now come to the relevant question: which observable 
should be used in order to derive cosmological parameters? 
In particular which should be the angle C over which the y 
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Planck: y max >2.3e-05, y 3 6 >2.2e-05, y 6 >2.1e-05 



Cluster Average diameter 0.2 arcmin, ACT, limit beam to 
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Figure 10. The slope (plain) and spread (dashed) of the fit- 
ting line for the input/output y parameter when averaged over 
different angles C . This plot is constructed using the 12 biggest 
clusters in (10 deg) 2 Planck maps. 



Planck: y max >1 .8e-04, y 3 6 >1.1e-04, y 6 >0.61e-04 
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Figure 11. The slope (plain) and spread (dashed) of the fit- 
ting line for the input/output y parameter when averaged over 
different angles 6. This plot is constructed using the 25 brightest 
clusters in the 10 Planck maps. In this case, the noise has been 
reduced by a factor 7 in all frequency bands. 



parameter should be averaged? In general we expect the an- 
swer to depend on the specific experiment in hand. In order 
to answer this question, we smoothed the input and recon- 
structed maps over several angles, and then we computed 
the slope of the reconstruction and the spread of the points 
around that slope. 

In figures and Q21 we show the slope and spread of 
the fitting as a function of the smearing angle for the ACT 




Original y parameter, x 10 4 



Figure 12. Reconstruction of the y parameter in the ACT ex- 
periment in the idealistic case where the beam is set to zero. The 
profile prior (plus) significantly reduces the spread in the recon- 
structed y parameter with respect to the Gaussian prior (circle). 



and Planck experiments, when the clusters in figs. Q and |H| 
are considered. 

As for the ACT case, we notice a big improvement when 
we reach the highest resolution of the instrument. At this C 
both the slope increases significantly (yielding almost per- 
fect reconstruction: 0.8-0.9) and the spread is greatly re- 
duced. Notice that for ACT the method proposed here re- 
duces the spread by about a factor 2. The overall error in 
the Reconstruction (10 per cent) should not present a major 
impediment in deriving cosmological parameters from this 
sample. 

The beam size is also driving the spread associated with 
the slope, which drops dramatically for C — 0.9. In order to 
better understand the role of the beam, we studied the ide- 
alistic case in which the beam is virtually zero (see fig. P. 
We find that in this limit, the Wiener filter reconstruction 
still shows a spread, while in our proposed model the spread 
is minimal. We conclude that should observational perfor- 
mances improve in the upcoming years, this method would 
become even more interesting. 

Also in the case of Planck (fig. HJJ, we see a significant 
improvement of this method on Wiener filtering in terms of 
the slope and the spread. However, the spread is still about 
30 per cent even for the most extended clusters considered 
here. Note also that these clusters were selected by being 
the biggest in the sample rather than the brightest, because 
some of the very compact bright clusters get completely con- 
fused with noise and do not get reconstructed at all. This 
could be a potential problem when it comes to derive cos- 
mological parameter from them. However, the real perfor- 
mances of the Planck instrument may be better than the 
ones used in this simulation. Moreover, the noise in the sky 
will not be uniformly distributed, in fact some areas will be 
better sampled than others. We assessed the relevance of the 
noise level by performing a similar analysis on the Planck 
maps with a reduced level of noise (about a factor 7 lower). 
The results are displayed in fig. llll for the 25 brightest clus- 
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ters in the Planck reconstructed maps. Here we appreciate 
that the noise is really the limiting factor for Planck: a sig- 
nificantly reduced noise would allow to have almost perfect 
reconstruction of bright clusters with a spread reduced to 10 
per cent. The critical issue now is how big will be the area 
of the sky where this performance will be achieved. Because 
the scanning strategy and the instrument performances are 
not settled yet, this is not a straightforward question to an- 
swer. It is clear, however, that the selection function for the 
Planck cluster catalog is also going to depend on the recon- 
struction method and associated errors in different areas of 
the sky. 

4.4 Completeness and purity of the samples 

Once a cluster map is reconstructed, it is sensible to ask 
whether the structures found really correspond to clusters 
in the input map or not. Furthermore one would like to 
know if a given threshold in the reconstructed map can be 
associated with an input cluster intensity with high confi- 
dence. To this aim, we tried to assess the completeness and 
purity of our samples for given output intensities. In order 
to assess the purity /completeness in ACT, we smoothed the 
maps to 0.9 arcmin and considered all local maxima in the 
reconstructed map which would have a cluster counterpart 
in the input one within a radius of 0.6 arcmin. We use the 
term "purity" to indicate the fraction of the targeted clus- 
ters which do have a counterpart in the input map. We found 
that the purity is one for all relevant intensities, i.e. the re- 
construction doesn't create clusters out of nothing, at least 
in the limit of the approximations applied here (i.e. no point 
sources). We then considered a given intensity in the input 
map, projected it into a reconstructed value using the slope 
calculated before and counted the fraction of objects that 
were effectively reconstructed above such threshold. We use 
the term "completeness" to indicate the fraction of the ini- 
tial clusters that make that threshold in the reconstructed 
map. The sample is complete for clusters with y parameter 
bigger than 3 x 10 -4 (there are about 15 such clusters in a 
100 deg 2 ACT survey). However completeness drops to 50 
per cent for y above 1.5 x 10 -4 (about 150 such clusters in a 
100deg 2 ACT survey). This is because very compact clusters 
are anyway mapped into less intense objects: they would be 
accounted for if we consider the spread associated with the 
reconstruction. 

As for Planck, we still have a very good purity level. 
However, the level of noise considered here prevents us from 
assessing completeness. The dimension of the cluster (rather 
than its intensity) seem to be the dominant factor in select- 
ing the clusters that are reconstructed. This biases the re- 
construction in favour of local clusters, rather than the most 
intense. Hopefully the real noise level will be better than the 
one used here in most areas of the sky, so that a selection 
on the basis of brightness rather than size will be possible. 



into account that clusters produce a non- Gaussian, local- 
ized signal and they are non-spherical, sparse objects on 
the sky. The reconstruction is based in the wavelet domain, 
therefore is local and takes into account covariances between 
different scales, positions and orientations. We studied the 
performances of this estimator with different models of non- 
Gaussianity, some corresponding to the one observed in SZ 
simulated maps and others that are common in image re- 
construction literature. We show that this method outper- 
forms previous techniques like Wiener filtering in recover- 
ing the cluster central intensity both in terms of average 
reconstruction and of associated error. This result mainly 
depends on the better characterization of the bright pixel 
(corresponding to cluster's centers) that we achieve with the 
non- Gaussian prior assumed. The success of this strategy de- 
pends on the combined effect of using a wavelet basis, a local 
reconstruction and an adequate non- Gaussian model for the 
signal. 

We applied our method to two experiments, very dif- 
ferent in nature: Planck and ACT. In the case of ACT, our 
method allows to recover the 80-90 per cent of the central 
intensity of our 50 brightest reconstructed clusters with a 
reconstruction error of about 10 per cent (mainly caused by 
the beam size). The 100 deg 2 ACT survey should contain 
about 150 of such clusters, and the subsample for with high 
completeness level should contain approximately 50 clusters. 
This is an adequate sample to constrain cosmology. 

In the case of Planck we were able to better reconstruct 
the most extended clusters, rather than the brightest. This 
method allows a reconstruction of the 45 per cent of the 
cluster intensity, improving on standard Wiener filtering by 
about a factor five. The error associated with the reconstruc- 
tion for the 12 targeted clusters is, however, still about 27 
per cent. While this is an improvement on the Wiener fil- 
tering, it is somewhat unsatisfactory for deriving cosmology. 
Another limitation derives from the particular selection ef- 
fect: most bright but compact clusters are not reconstructed 
at all. We point out that the major limitation for Planck is 
the noise level, and show that a reduced noise level would 
allow almost perfect reconstruction with very little scatter 
for a sample of intensity-selected clusters. The area of the 
Planck sky where this will be possible depends upon the 
final scan strategy and instrument performance, which are 
not precisely defined yet. In general, we shall expect to have 
very different selection functions which will depend on sky 
positions: a detailed study like the present one will be nec- 
essary to infer actual performance. 

These results may depend heavily on our neglect of 
point sources. It is important to point out, however, that 
the localized and non- Gaussian nature of the point source 
signal will almost certainly affect the standard Wiener-filter 
techniques more adversely than the non- Gaussian estimator 
presented here. We will pursue this more in future research. 



5 CONCLUSIONS 

We investigated the issue of SZ cluster reconstruction in fu- 
ture multi-frequency CMB experiments. We proposed a new 
method for component separation that is specifically tai- 
lored to reconstruct SZ galaxy clusters. Our approach takes 
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